As of 2016-02-26, there will be no more posts for this blog. s/blog/pba/
Showing posts with label Google Webmaster Tools.

Sadly, Google Webmaster Tools is going to remove Subscriber stats, which I read on a weekly basis. Google says there are alternatives or replacements available, such as FeedBurner:

Subscriber stats reports the number of subscribers to a site's RSS or Atom feeds. This functionality is currently provided in Feedburner, ...

Well, the fact is that it is not. I have never seen exactly the same statistics in FeedBurner (I will explain why they differ shortly). Clearly the poster, Jonathan Simon, Webmaster Trends Analyst, has no idea what information their products (Subscriber stats and FeedBurner's Subscribers tab) provide or what it is based on. At least, he is not familiar with both of them together.

Please, allow me to explain. Firstly, look at Subscriber stats (before it gets wiped):


You can see there are two entries: one is the Atom feed and the other is the RSS feed, which is basically Blogger's default setting. The types may differ if you redirect to other feed services. You don't see this level of detail in FeedBurner:


The subscriber counts of the two feeds are lumped into the Google Feedfetcher entry, which is not what I want. It shows that FeedBurner is not a replacement, and you can barely call it an alternative. FeedBurner does not give you the information you have in Webmaster Tools; it gives a summation over different sources. From the explanation:

Google Feedfetcher

Feedfetcher is how Google grabs RSS or Atom feeds when users subscribe to them in Google Reader or iGoogle. Subscriber counts include Google Reader and the iGoogle. Feedfetcher collects and periodically refreshes these user-initiated feeds, but does not index them in Blog Search or Google's other search services.

At first glance, it seems to say that this is the same data as in Subscriber stats, but it is not, because the numbers don't match: 62 != 21 + 3. Secondly, again, it's a summation, and I would like to have more detail.

As you can see in the screenshot, it's the old FeedBurner interface. I should switch to the new interface and see if the information is there. Wait, where is the link to switch the interface version?

It's gone, and I didn't notice until now. They probably dropped development some time ago and didn't even bother to post an announcement on FeedBurner's blog. Oh, yeah, FeedBurner does have a blog; the last post was published in October 2010. You get the picture, right?

Why 62 != 21 + 3?

First of all, you need to know where the 21 and the 3 come from. They are statistics reported by the Google Feedfetcher bot in its User-Agent header. If you want to get the numbers for your Blogger blog, you need to redirect the feed to a piece of code that intercepts the user agent and then redirects (again) to the real feed, for instance /feeds/posts/default?orderby=updated, which I think is actually where FeedBurner gets your content.

The bottom line is, if FeedBurner does not provide such detailed information, you need an intermediate server to capture it in order to get individual numbers instead of just a summation.
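A minimal sketch of the "intercept the user agent" part, in Python. It assumes the Feedfetcher User-Agent carries a phrase like "21 subscribers"; the sample string and its feed-id are made up for illustration, and `parse_subscribers` and `tally` are hypothetical helper names, not part of any Google API.

```python
import re

# Assumed shape of a Feedfetcher User-Agent; the feed-id here is made up.
SAMPLE_UA = ("Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; "
             "21 subscribers; feed-id=1234567890)")

# Matches "21 subscribers" or "1 subscriber" anywhere in the header value.
_SUBSCRIBERS = re.compile(r"(\d+)\s+subscribers?\b", re.IGNORECASE)


def parse_subscribers(user_agent):
    """Return the subscriber count advertised in a User-Agent, or None."""
    match = _SUBSCRIBERS.search(user_agent or "")
    return int(match.group(1)) if match else None


def tally(requests):
    """Keep the latest count per (feed path, reader name) pair.

    `requests` is an iterable of (path, user_agent) tuples, for example
    collected by an intermediate server before it redirects to the real feed.
    """
    counts = {}
    for path, user_agent in requests:
        count = parse_subscribers(user_agent)
        if count is None:
            continue  # ordinary browsers and bots report no count
        reader = user_agent.split(";", 1)[0].strip()
        counts[(path, reader)] = count
    return counts
```

Keying the tally by feed path keeps the Atom and RSS counts, and the counts from old redirected feeds, apart instead of summing them.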

Back to the question: why does my blog have 62 subscribers in FeedBurner when Webmaster Tools only reports 21 + 3?

This is because I have other Blogger blogs' feeds redirected to the same FeedBurner feed after I imported them into this blog.

By now, you should be aware that the problems are:
  1. You can't distinguish the subscribers of different types of feed.
  2. You can't tell the subscribers apart if they are redirected from different sources.
These two problems are fundamentally the same problem. FeedBurner sums up all subscriber counts from Feedfetcher, whether the requests are direct or redirected, with or without a query string. As long as the redirected requests are made to the same burned feed URL, they are all summed up as one number.

If you don't care, that's fine; you are satisfied with one big chunk of a number. I care more about this kind of detail. I want to know what type of feed is being subscribed to and how many subscribers come from my old blogs' feeds. Once Webmaster Tools removes Subscriber stats, I can't know that unless I write a simple script to record the numbers. It's not hard, but I do not want to do that.

In fact, it only takes a few essential lines on Google App Engine. I wonder if there is a service which redirects to a specific location and logs all HTTP headers of incoming requests.
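For illustration, here is a rough sketch of such a redirect-and-log endpoint. It uses Python's standard `http.server` module rather than Google App Engine, so it is a stand-in and not the few lines I had in mind for App Engine. `TARGET`, `LOG_FILE`, `log_record`, and `serve` are all hypothetical names, and the target URL is a placeholder.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder target: replace with the real feed URL.
TARGET = "http://example.blogspot.com/feeds/posts/default"
LOG_FILE = "requests.log"


def log_record(path, headers):
    """Serialize one request as a JSON line: the path plus all headers."""
    return json.dumps({"path": path, "headers": dict(headers)}, sort_keys=True)


class LogAndRedirect(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record every incoming header before sending the client on.
        with open(LOG_FILE, "a") as log:
            log.write(log_record(self.path, self.headers.items()) + "\n")
        self.send_response(302)
        self.send_header("Location", TARGET)
        self.end_headers()


def serve(port=8080):
    """Run the logger until interrupted."""
    HTTPServer(("", port), LogAndRedirect).serve_forever()
```

Calling `serve()` starts it on port 8080. Each request appends one JSON line to `requests.log`, so the User-Agent values can be picked out later.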

However, there is a way to get the Atom feed subscriber count, which I have known about for a long time. I can search for my blog in Google Reader, and I get exactly 21 subscribers reported for my blog's feed. The problem is that the search function is somewhat slow, and it feels strange to do such a task every week.

[edited 2012-04-29T16:05:52Z: An easier way to get the subscriber count in Google Reader is to subscribe to the feed and get the count via Feed settings... View details and statistics. I guess I have to shamelessly subscribe to my own blog feed for the count.]

To be fair, I am not surprised that the poster doesn't know about this (I assume). Google has too many products already, and it's hard to know every bit. If you tested their employees, I am sure most of them couldn't even name half of the products which are still in active development. In my case, you need this kind of experience to know that some details are missing from FeedBurner's report.

My conclusion is that FeedBurner's Subscribers tab is not a replacement for Subscriber stats in Webmaster Tools. Unfortunately, I will have to live with it.

Besides this issue, there are other removals, such as the robots.txt generator. I have no problem with that one because I never used it. But some may find it handy; maybe they should open-source that part as a standalone page.

The last one is Site Performance, which I had used once. I did want to see more from it, but I didn't know why there was no new data. Google Analytics has since started to provide much more thorough data, down to each single page, which I posted about a few days ago.

I always want to know who links to my blog, so I check the referrer data in Blogger Stats and Google Analytics reports, and I have also set up Google Alerts. I even search for this blog's domain name to see if there are any new hits. (Use the "Past 24 hours" time range; it's very useful.)

But that doesn't seem to be enough for me. Those methods always seem to miss some links, and Alerts hasn't even sent me anything for a long time.

In Webmaster Tools, you can download a CSV of link-ins by clicking the "Download more sample links" button (so, this is not complete?) in Your site on the web / Links to your site / Who links the most "More". (Lost?)

It is a list of external pages which link to your site. Since it is long, there is no humanly possible way to tell which links are new.

So, I wrote a simple Bash script to do the job. Run it with the CSV files as arguments.


You can run it with CSV files from different websites; it has no problem with that. Once the CSV files are processed, they are safe to remove. You only need to keep the first two files in the last file list in the screenshot above.

This script has a few predefined regular expressions to filter out some common duplicate URLs, such as WordPress's and Blogger's archive or index-like pages. What you really want to see are the posts which link to your website in their content.
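The idea can be sketched in Python as well. This is not the Bash script itself, only an illustration under a few assumptions: the linking URL sits in the first CSV column, the patterns in `SKIP_PATTERNS` are my guesses at typical archive or index-like URLs, and the caller keeps the `seen` set between runs. All names here are hypothetical.

```python
import csv
import re

# Illustrative patterns for archive or index-like pages (assumed, not the
# actual list used by the Bash script).
SKIP_PATTERNS = [
    re.compile(r"/search/label/"),         # Blogger label pages
    re.compile(r"_archive\.html$"),        # Blogger archive pages
    re.compile(r"/\d{4}(/\d{2})?/?$"),     # year or year/month archives
    re.compile(r"/(tag|category|page)/"),  # WordPress listings
]


def read_links(csv_path):
    """Yield the first column of each row that looks like a URL."""
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            url = row[0].strip() if row else ""
            if url.startswith(("http://", "https://")):
                yield url


def new_links(csv_paths, seen):
    """Return unseen, non-index links in first-seen order and update `seen`."""
    fresh = []
    for path in csv_paths:
        for url in read_links(path):
            if url in seen or any(p.search(url) for p in SKIP_PATTERNS):
                continue
            seen.add(url)
            fresh.append(url)
    return fresh
```

Running `new_links` again with the same `seen` set returns only links that were not reported before, which is the part that is hard to do by eye.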

Thanks for the notification!

Here is the email I just received:

Dear Webmaster,

Your site, https://<mydomain>/, uses an SSL certificate which is not recognized by web browsers. This will cause many web browsers to block users from accessing your site, or to display a security warning message when your site is accessed.

To correct this problem, please get a new SSL certificate from a Certificate Authority (CA) that is trusted by web browsers.

Thanks,

The Google Web Crawling Team

Where <mydomain> is yjl.im. At first glance, I thought this was a new kind of phishing, but it's real; the message was also in Webmaster Tools.

First of all, I believe I have never written down http://yjl.im/ anywhere, not to mention the HTTPS version. So, I guess Google is very kind to check that for you. If it weren't for this email, I would never have thought to check it.
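A check along those lines can be sketched with Python's standard `ssl` module. This is only my approximation of what such a check does, not Google's actual crawler logic, and `check_certificate` is a hypothetical helper name. It verifies the certificate against the local system's trusted CAs, which may differ from what a given browser trusts.

```python
import socket
import ssl


def check_certificate(host, port=443, timeout=5.0):
    """Return (ok, detail) for a TLS handshake verified against system CAs."""
    context = ssl.create_default_context()  # verifies chain and hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version()
    except ssl.SSLCertVerificationError as error:
        # Raised for untrusted issuers, expired certificates, or name mismatch.
        return False, "certificate rejected: %s" % error.verify_message
    except OSError as error:
        # Covers refused connections, timeouts, and other TLS failures.
        return False, "connection failed: %s" % error
```

For example, `check_certificate("yjl.im")` would return a pair whose first item tells whether the certificate was accepted.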

And here is a screenshot of the certificate:


The naked domain has URL forwarding to www.yjl.im. I use my registrar's free service, so I have no control over it. If Google App Engine could operate on a naked domain, I wouldn't need that.

I don't understand why their servers listen for HTTPS; that makes no sense. Anyway, I might turn off URL forwarding, or see what my registrar says about it, or just forget the whole thing...

Google Webmaster Tools told me:

Performance overview

On average, pages in your site take 6.4 seconds to load (updated on Oct 31, 2010). This is slower than 83% of sites. These estimates are of low accuracy (fewer than 100 data points). The chart below shows how your site's average page load time has changed over the last few months. For your reference, it also shows the 20th percentile value across all sites, separating slow and fast load times.

http://4.bp.blogspot.com/-ddn_YW5Pl8I/T4faS3WuFRI/AAAAAAAADO8/myClQS47l8M/s800/chart.png

6.4 seconds?! Are you kidding me? Well, it's not. My post pages really take about that long to load. You might want to ask me, what's the problem with 6.4 seconds? It's not really long compared to other blogs. It's not, but if you look at my page, you don't see any crap banners, icons, or images. My blog is basically pure text, except for the images in post contents and two Google AdSense ad units. So, it's too long to me.

I am pretty sure I know where I can cut the number down: Disqus!

https://farm5.staticflickr.com/4104/5212507835_8448c57ce2_o.png

If you scroll down, you will see that Disqus is no longer loaded by default. You need to click the button to load comments. I also made the related posts list load on user request, which helps a little, and I moved the jQuery code into yjlv.js: request_count--. I know this is cheating in order to get a lower loading time.

Here is a diff for my template changes:

--- template.xml.orig       2010-11-28 08:11:25.000000000 +0800
+++ template.xml    2010-11-28 07:58:55.000000000 +0800
@@ -3,7 +3,7 @@
 <html xmlns='http://www.w3.org/1999/xhtml' xmlns:b='http://www.google.com/2005/gml/b' xmlns:data='http://www.google.com/2005/gml/data' xmlns:expr='http://www.google.com/2005/gml/expr'>
   <head>
     <b:if cond='data:blog.pageType == &quot;item&quot;'>
-      <title><data:blog.pageName/> &lt;&lt;&lt; $(<data:blog.title/>)</title>
+      <title><data:blog.pageName/></title>
       <b:else/>
       <b:if cond='data:blog.pageType == &quot;static_page&quot;'>
         <title><data:blog.pageName/> &lt;&lt;&lt; $(<data:blog.title/> --page)</title>
@@ -26,8 +26,8 @@
     <data:blog.feedLinks/>
     <b:skin><![CDATA[]]></b:skin>
     <link href='http://www.yjl.im/css/yjlv.css?4' rel='stylesheet'/>
-    <script src='http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js'/>
     <!--[if lt IE 9]><script src="http://html5shiv.googlecode.com/svn/trunk/html5.js"></script><![endif]-->
+    <script src="http://www.yjl.im/js/yjlv.js?8"></script>
   </head>
   <body>
     <header id='blog-header'>
@@ -253,7 +253,12 @@
             </div>
             <footer>
               <b:if cond='data:blog.pageType == &quot;item&quot;'>
-                <div style='float:right;width:312px'>        Possibly (Definitely Not) Related Posts:        <div id='gas-results'/>      </div>
+                <div style='float:right;width:312px'>
+                                   <div>Possibly (Definitely Not) Related Posts:</div>
+                                   <div id='gas-results'>
+                                           <input type="button" value="Click to load related posts list" onclick='$.getScript("http://brps.appspot.com/gas.js")'/>
+                                   </div>
+                           </div>
               </b:if>
               <div class='post-footer-line post-footer-line-1'>
                 <span class='post-author vcard'>
@@ -312,11 +317,19 @@
             </footer>
             <b:if cond='data:blog.pageType == &quot;item&quot;'>
               <section id='post-comments'>
-                <h2>Comments</h2>
+                <h2><a expr:href='data:post.url + &quot;#disqus_thread&quot;'>Comments</a></h2>
                 <div id='disqus_thread'/>
-                <script type='text/javascript'> if (document.location.href.indexOf(&#39;/b/post-preview&#39;) == -1) $.getScript(&#39;http://yjlv.disqus.com/embed.js&#39;);</script>
-                <noscript>Please enable JavaScript to view the <a href='http://disqus.com/?ref_noscript=yjlv'>comments powered by Disqus.</a></noscript>
-                <a class='dsq-brlink' href='http://disqus.com'>blog comments powered by <span class='logo-disqus'>Disqus</span></a>
+                           <script>
+                           $(function(){
+                                   // If visitors are led to comments, then load comments automatically.
+                                   var href = document.location.href;
+                                   if (href.indexOf('#disqus_thread') >= 0 || href.indexOf('#comment-') >=0) {
+                                           $.getScript("http://yjlv.disqus.com/embed.js");
+                                           $('#comments-loader-button').remove();
+                                           }
+                                   });
+                           </script>
+                           <input type="button" id="comments-loader-button" style="width:620px;margin:10px;" value="Click to load comments or to write a comment" onclick='$.getScript("http://yjlv.disqus.com/embed.js");$(this).remove();'/>
               </section>
             </b:if>
           </article>
@@ -639,4 +652,4 @@
       </div>
     </footer>
   </body>
-</html>
\ No newline at end of file
+</html>

I added a small piece of code which automatically loads comments when visitors come via a link like .../post-title.html#disqus_thread or .../post-title.html#comment-1234567. Those visitors will have no problem reading the comments.

Also, I made a change to the post page titles. I removed <<< $(YJL --verbose) from the title because I saw this on my Stats page:

http://farm6.static.flickr.com/5201/5212521441_ae24c47be9.jpg

The last keyword, "yjl table", made me do so. That visitor must have found nothing of what he or she was looking for. "yjl" matched the page title; search engines are just not as smart as you and me. If I were that visitor, I would not even click on the result, since it's clear that the matched part is useless.

We will see if the next check (after 2010-11-27T08:47:50-07:00) results in a significant drop in loading time. I believe it will.